Journal: PLOS Computational Biology
Article Title: Machine learning-based prediction reveals kinase MAP4K4 regulates neutrophil differentiation through phosphorylating apoptosis-related proteins
doi: 10.1371/journal.pcbi.1012877
Figure Legend Snippet: (A) Three-component Gaussian Mixture Models (GMM) were applied to the NeuRGI scores across all 19,288 predicted genes. The solid black line represents the overall score distribution, and the dashed lines indicate the three probability density functions (PDFs). Intersections of the three PDFs define the NeuRGI thresholds for classifying genes into function (red shading), non-function (blue shading), and uncertain (middle, white) categories. (B) The line plot depicts the overlap of functional transcription factors (TFs) predicted by CellOracle and NeuRGI (red line), and by SCENIC and NeuRGI (blue line). The x-axis represents the top N TFs ranked by CellOracle’s perturbation score or SCENIC regulon activity. The y-axis shows the percentage of overlap between the predicted functional TFs. (C) The boxplot illustrates the transformed feature values for 4,786 functional genes and 4,734 non-functional genes in four feature groups. The p value was calculated using the Student’s t-test. (D) The GO-BP term network illustrates the five main clusters enriched from the 4,786 predicted functional genes (see Methods). Each dot represents a GO term, and the dot size indicates the enrichment score. The dashed ovals indicate GO terms with similar functions. (E) Proportion of regulatory (red) and other (blue) genes across three gene sets. Regulatory genes include enzymes (Enz), membrane proteins (MP), RNA binding proteins (RBP), and transcription factors (TF). The p value was calculated by the proportion test. Numbers within bars represent the gene counts. (F) Boxplot of NeuRGI scores, color-coded by gene type: enzymes (Enz), membrane proteins (MP), RNA binding proteins (RBP), transcription factors (TF), and others. The p value was calculated using the Student’s t-test. (G) The scatter plot shows the impact of in silico knockout of 2,569 predicted functional regulatory genes on the “positive regulation of myeloid cell differentiation” pathway and MAGMA Zscore (neutrophil count).
The y-axis represents the -log10(p value) for the pathway after in silico knockout of the gene (higher values indicate greater impact), and the x-axis represents the gene’s effect on the ‘neutrophil count’ trait (higher Zscores indicate greater impact). Different colors represent different categories of genes. Dot size indicates the NeuRGI score, and contour lines show point density. A cutoff (y = 1.8 and x = 3.6) was set based on the contour lines, dividing the scatter plot into four regions. (H) Expression of the top 12 genes in different immune cells from ImmuNexUT, including Naïve CD4 T cells (Naïve CD4), Memory CD4 T cells (Mem CD4), T helper 1 cells (Th1), T helper 2 cells (Th2), T helper 17 cells (Th17), T follicular helper cells (Tfh), Fraction II effector regulatory T cells (Fr. II eTreg), Fraction I naïve regulatory T cells (Fr. I nTreg), Fraction III non-regulatory T cells (Fr. III T), Naïve CD8 T cells (Naïve CD8), CD8+ T effector memory CD45RA+ cells (TEMRA CD8), Effector Memory CD8 T cells (EM CD8), Central Memory CD8 T cells (CM CD8), Naïve B cells (Naïve B), Unswitched memory B cells (USM B), Switched memory B cells (SM B), Double Negative B cells (DN B), Plasmablasts (Plasmablast), Natural Killer cells (NK), CD16 positive monocytes (CD16p Mono), Non-classical monocytes (NC Mono), Intermediate monocytes (Int Mono), Classical monocytes (CL Mono), Myeloid dendritic cells (mDC), Plasmacytoid dendritic cells (pDC), neutrophils (Neu), and Low-Density Granulocytes (LDG). (I) Expression of 9 genes during neutrophil differentiation in human UCB. These 9 of the top 12 genes were dynamically upregulated: CREBBP, CYP27A1, JAK2, IFNGR1, MAP4K4, PLCG2, PTPRC, TIGAR, and TYK2. We set a ‘time cut’ for cells at different differentiation stages, with HSC set as 1 and Neu as 5, and performed linear regression fitting for the expression of all 9 genes. R represents the Pearson correlation coefficient, and the p value (p) was calculated by the Student’s t-test.
(J) Expression of the top 10 genes in single-cell pseudotime analysis of 2,803 neutrophils in mouse bone marrow. All 12 genes except Cyp27a1 and HLA-DQA1 were upregulated during neutrophil maturation. (K) The heatmap displays the log2(fold change) in the expression of the top 12 genes in neutrophils from patients with 10 immune-related diseases compared to those from healthy individuals, including ANCA-associated vasculitis (AAV), Takayasu arteritis (TAK), Adult-onset Still’s disease (AOSD), Behçet’s disease (BD), Rheumatoid arthritis (RA), Systemic sclerosis (SSc), Idiopathic inflammatory myopathy (Myo), Sjögren’s syndrome (SjS), Mixed connective tissue disease (MCTD), and Systemic lupus erythematosus (SLE). The p value was calculated using the Student’s t-test, and * represents statistical significance (p < 0.05). The histogram represents the number of diseases in which each gene shows a significant change. (L) Bar plot displaying significantly affected pathways after OntoVAE in silico knockout of MAP4K4 in neutrophils. The blue dashed line represents the significance threshold (p = 0.05).
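Illustrative sketch for panel (A): the legend states that the intersections of the three GMM component PDFs define the NeuRGI thresholds. The snippet below shows one way such intersections could be located once a mixture has been fitted. It is a minimal sketch, not the authors' code: the component parameters are invented placeholders, and the function names (`weighted_pdf`, `intersection`, `classify`) are hypothetical. It also assumes that the highest-mean component corresponds to the "function" class.

```python
import math

def weighted_pdf(x, weight, mean, std):
    """Gaussian density of one mixture component, scaled by its mixing weight."""
    z = (x - mean) / std
    return weight * math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def intersection(comp_a, comp_b, tol=1e-9):
    """Find where two weighted PDFs cross between their means, by bisection."""
    lo, hi = sorted((comp_a[1], comp_b[1]))

    def diff(x):
        return weighted_pdf(x, *comp_a) - weighted_pdf(x, *comp_b)

    if diff(lo) * diff(hi) > 0:
        raise ValueError("no sign change between the two component means")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if diff(lo) * diff(mid) <= 0:
            hi = mid  # crossing lies in the lower half
        else:
            lo = mid  # crossing lies in the upper half
    return 0.5 * (lo + hi)

def classify(score, low_cut, high_cut):
    """Assign a score to one of the three categories named in the legend."""
    if score >= high_cut:
        return "function"
    if score <= low_cut:
        return "non-function"
    return "uncertain"

# Hypothetical fitted parameters as (weight, mean, std); NOT values from the paper.
components = [(0.3, 0.2, 0.08), (0.4, 0.5, 0.10), (0.3, 0.8, 0.08)]

low_cut = intersection(components[0], components[1])
high_cut = intersection(components[1], components[2])

for score in (0.1, 0.5, 0.9):
    print(score, classify(score, low_cut, high_cut))
```

Bisection is used here only because it needs no external library; in practice the mixture itself would first be fitted to the scores with a statistics package, and the crossing points could equally be obtained by solving the quadratic that results from equating the two log-densities.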
Article Snippet: The blots were probed with primary antibodies, including rabbit anti-MAP4K4 antibody (CST, #3485), rabbit anti-p-STAT5A (Tyr699) antibody (Abcam, #ab32043), rabbit monoclonal (E289) anti-STAT5A antibody (Abcam, #ab32043), rabbit anti-STAT5B antibody (Abcam, #ab30648), rabbit monoclonal (C11C5) anti-p-STAT5 (Tyr694) antibody (CST, #9359), rabbit monoclonal (D2O6Y) anti-STAT5 antibody (CST, #94205), and rabbit monoclonal (14C10) anti-GAPDH antibody (Cell Signaling Technology, #2118), in the universal antibody diluent (NCM biotech, #WB500D) overnight at 4°C, washed three times with TBST, and then incubated with HRP-conjugated goat anti-rabbit antibodies (HuaBio, #HA1001).
Techniques: Functional Assay, Activity Assay, Transformation Assay, Membrane, RNA Binding Assay, In Silico, Knock-Out, Cell Differentiation, Expressing